{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ntGNDuSCeAR2"
   },
   "source": [
    "# FastEmbed on GPU\n",
    "\n",
    "As of version 0.2.7, FastEmbed supports GPU acceleration.\n",
    "\n",
    "This notebook shows how to install and use FastEmbed on GPU.\n",
    "\n",
    "## Installation\n",
    "\n",
    "FastEmbed depends on `onnxruntime` and inherits its approach to GPU support.\n",
    "\n",
    "To run ONNX models on GPU, you need the `onnxruntime-gpu` package, which replaces `onnxruntime` and provides all of its functionality.\n",
    "FastEmbed follows the same pattern and requires the `fastembed-gpu` package to be installed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "GK2XADwUeEK7"
   },
   "outputs": [],
   "source": [
    "!pip install fastembed-gpu"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3aiGPqjCeGzo"
   },
   "source": [
    "**NOTE**: `onnxruntime-gpu` and `onnxruntime` cannot be installed in the same environment. If `onnxruntime` is already installed, uninstall it before installing `onnxruntime-gpu`. The same applies to `fastembed` and `fastembed-gpu`.\n",
    "\n",
    "### CUDA 12.x support\n",
    "\n",
    "By default, `onnxruntime-gpu` ships with CUDA 11.8 support.\n",
    "For CUDA 12.x, install `onnxruntime-gpu` from a dedicated package index by passing its URL to `pip`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "OoSfWFFZeJ5t",
    "outputId": "417b9332-6a7b-4000-c74b-4ed2b5b76590"
   },
   "outputs": [],
   "source": [
    "!pip install onnxruntime-gpu -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/ -qq\n",
    "!pip install fastembed-gpu -qqq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3xx3r-9jgAMi"
   },
   "source": [
    "You can check your CUDA version with `nvidia-smi` or `nvcc --version`.\n",
    "\n",
    "Google Colab notebooks have CUDA 12.x."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Igv5RXhSeO68"
   },
   "source": [
    "### CUDA drivers\n",
    "\n",
    "FastEmbed does not include CUDA drivers or cuDNN libraries.\n",
    "You need to set up that part of the environment yourself.\n",
    "The dependencies required for your chosen `onnxruntime` version are listed in the [CUDA Execution Provider requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements)."
   ]
  },
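  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check after installation, the cell below is a minimal sketch that asks `onnxruntime` which execution providers the installed build exposes, using `onnxruntime.get_available_providers()`. It assumes `onnxruntime-gpu` is already installed, and the printed hint is only illustrative. A provider appearing in this list does not by itself guarantee that the CUDA and cuDNN libraries will load at runtime."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import onnxruntime as ort\n",
    "\n",
    "# List the execution providers exposed by the installed onnxruntime build\n",
    "available_providers = ort.get_available_providers()\n",
    "print(available_providers)\n",
    "\n",
    "# Illustrative check (an assumption, not part of FastEmbed): the GPU build should list CUDAExecutionProvider\n",
    "if \"CUDAExecutionProvider\" not in available_providers:\n",
    "    print(\"CUDAExecutionProvider not found; check that onnxruntime-gpu is installed\")"
   ]
  },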
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 334,
     "referenced_widgets": [
      "aacf08a7aa444b64a2efad1967d28a53",
      "5606aa785de74d65a9928b31c0be8a53",
      "d4ec9d3b74ec4412894da2161ed2bddf",
      "8edd544c3e074ec1813e5b9d1aef43d9",
      "9898890f8a75468ea20e3ce319d0b6e2",
      "da3b18abb16241a0a7191ee9afcb0510",
      "258a619168824253a6a329efdc51ebe6",
      "53c7cdc967d24faba0b5c659c94c50b8",
      "e9348d8be28d408e8e760c71b21ab294",
      "0ba06e0816714f2fbdec8260f160abc0",
      "0b96563334964d449dd34f35b6b3e715",
      "11c2eec490e8479b944eec7f30cb1ca2",
      "91463da0d1c5466795e06ab586002259",
      "30f4f7833406474f89ef0700b00a33aa",
      "4302c304ec6a4b5985797e300bd7e353",
      "2605640c7b824ed7aa137d404e14b774",
      "b02efe3a33d04f06aa8938719ab35671",
      "50408e5d052343b1a1b44a0fae0f801d",
      "1be01c95d9e84f8ea88367c987a72fdc",
      "a109c13bc93a449186424542dc330be8",
      "4adce304ce1947b5a01dde10bbb3bb8c",
      "a761366a37e44837a25e0f25b18efed2",
      "94512b9055e546389471197b76ad5449",
      "072dca00bd7b4918a178f90ccabf698a",
      "48a856c59ef74cc3834521b1bf616541",
      "c020c503aeaa464cad643ade5ee3ae24",
      "a4e7e40c0bbd4f878c20a9f65fe3a048",
      "e8c0a1c339fd47668d944a9defad79d4",
      "cd782d35c6bd40c0a60d57b1828a7251",
      "04f638ab08da4d20928644c4ba03f8ef",
      "17f20477fc79475f97adf1c1f64a4192",
      "96f7b5a2e224462e9fcffd03f906a593",
      "755cd32d9fc9407c80a160f45c802d1e",
      "a886258e7cd14c048b58391d7b772901",
      "bc3e48f826a74840867a6209e622b75e",
      "125b2ac0f78043bba7eca53474ca44c4",
      "82f186d1ffb4435d94a6c7e9025242ef",
      "77000333e5ca4094be291ad82d4a627a",
      "7fe64fb53055431488d002c76c8e331e",
      "7de59ae9919f4a5bb2b6e601a3c02412",
      "97a69423a6644eab87fc636e182f23a4",
      "4df936d1065b41f4bf02ed394fdf7b7e",
      "3918bd1affa3454e8e9044a418a056ea",
      "163b27ae0bce41e5b48efcb4b3fd780d",
      "94631fd6e0744085bc79c3121de4a9f7",
      "31cd98d66bc54418b35e70fbbc0fa3c0",
      "6d21627a638b4ddca6fe7bfb80a621b5",
      "b37bed9dc4fe45c08b8397288fe5b1a9",
      "164fef95d1414177a40d563f5682f6a3",
      "1a9a0ea53448413a8e4b360b7bb69e26",
      "dd1a4483b4b045c6929e3d2cf1338f63",
      "496ddd8e05f949cd8cbba8e677f476ac",
      "2813be951d7f48b2aad1dd4a444ce3eb",
      "8e9a2c2dd21942edbdfecb3b7dffc70b",
      "08a10fe247f1425db044cfc13f2fb384",
      "b8786aded92d421592bc7623c5c7899e",
      "c91a20a9433d4016ba2db69fa50e0b4d",
      "e997820738594c6dadb061908d7afdc1",
      "a5fc751f81ae498f9aa55ece0e6853b2",
      "2aee4fc8cda64c5eb8722be81e48e0ca",
      "3a53e8624dff48b3959875ef58ee99ce",
      "50a70044f77542108fe188598e70797e",
      "13cf998b35ae4507a63e797f6fa3eada",
      "6209eb6a68cf4a378767ef34d0d9216d",
      "7395db766b944af9b41d6b56c9ada0b1",
      "42122c317ec648688f0164a1adb5df28"
     ]
    },
    "id": "Ttf4YggPeQQK",
    "outputId": "aa75129d-9e2d-4c88-cf03-251dd43a11b1"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "aacf08a7aa444b64a2efad1967d28a53",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "11c2eec490e8479b944eec7f30cb1ca2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "tokenizer_config.json:   0%|          | 0.00/1.24k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "94512b9055e546389471197b76ad5449",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "config.json:   0%|          | 0.00/706 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "a886258e7cd14c048b58391d7b772901",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "special_tokens_map.json:   0%|          | 0.00/695 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "94631fd6e0744085bc79c3121de4a9f7",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "tokenizer.json:   0%|          | 0.00/711k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b8786aded92d421592bc7623c5c7899e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "model_optimized.onnx:   0%|          | 0.00/66.5M [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "['CUDAExecutionProvider', 'CPUExecutionProvider']"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from typing import List\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "from fastembed import TextEmbedding\n",
    "\n",
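    "# Request the CUDA execution provider so inference runs on the GPU\n",
    "embedding_model_gpu = TextEmbedding(\n",
    "    model_name=\"BAAI/bge-small-en-v1.5\", providers=[\"CUDAExecutionProvider\"]\n",
    ")\n",
    "# Inspect which providers the underlying ONNX Runtime session actually uses\n",
    "embedding_model_gpu.model.model.get_providers()"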
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "id": "iPtoHf7GeV-i"
   },
   "outputs": [],
   "source": [
    "documents: List[str] = list(np.repeat(\"Demonstrating GPU acceleration in fastembed\", 500))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "islhyLf4ed-H",
    "outputId": "8c8ed09b-9eac-438f-97bc-578751975148"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "43.4 ms ± 2.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "list(embedding_model_gpu.embed(documents))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 67,
     "referenced_widgets": [
      "9c306ce5188c45feb8dfb9089592591c",
      "296ff54c6e61441f978084df59626598",
      "d6d42b4f245a49b7ba7769e23a3202fc",
      "39ce7754480147759c16a3089d8105af",
      "8253960a069d4106863a75faae54b90d",
      "7ccf959452af4c0b873c7567747f0816",
      "ac9d0b5a5b1f401e90a1cc9ffe6d4b4c",
      "0aada067dec3472f9aba1772d6b775a5",
      "07597b1287e04653b80c47a771549376",
      "054be1dd9f084cae911745b692ccd929",
      "ab19e8e831694e308a4b79f05aff728e"
     ]
    },
    "id": "bOKVUvWJegYJ",
    "outputId": "dde74917-08b0-4ce2-9a2b-cc31e02cafb2"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9c306ce5188c45feb8dfb9089592591c",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Fetching 5 files:   0%|          | 0/5 [00:00<?, ?it/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "['CPUExecutionProvider']"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Without `providers`, the model runs on the CPU execution provider\n",
    "embedding_model_cpu = TextEmbedding(model_name=\"BAAI/bge-small-en-v1.5\")\n",
    "embedding_model_cpu.model.model.get_providers()"
   ]
  },
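  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before comparing timings, it can be useful to confirm that both models return matching embeddings. The cell below is a minimal sketch that embeds one document with each model and compares the vectors. The tolerance `atol=1e-3` is an assumption chosen for illustration; small floating-point differences between CPU and GPU kernels are expected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Embed the same single document with both models\n",
    "gpu_vector = next(iter(embedding_model_gpu.embed(documents[:1])))\n",
    "cpu_vector = next(iter(embedding_model_cpu.embed(documents[:1])))\n",
    "\n",
    "# Both vectors should have the same dimensionality\n",
    "print(gpu_vector.shape, cpu_vector.shape)\n",
    "\n",
    "# Compare element-wise within an assumed tolerance\n",
    "print(np.allclose(gpu_vector, cpu_vector, atol=1e-3))"
   ]
  },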
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "0NJj9RvSfASP",
    "outputId": "526f5280-99bd-454e-8af8-6a860ad96e54"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4.33 s ± 591 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "list(embedding_model_cpu.embed(documents))"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "T4",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
